Fast Barrier Synchronization on Shared Fast Ethernet

Authors

  • Giovanni Chiola
  • Giuseppe Ciaccio
Abstract

Shared LAN is presently the most widespread networking technology, due to its extremely low cost and favourable cost/performance ratio. Clusters of Personal Computers (PCs) leveraging shared 100base-T Ethernet may currently offer the best price/performance in parallel processing. Most numerical parallel algorithms make heavy use of collective communications and especially barrier synchronization. Hence a critical issue on PC clusters is to offer efficient implementations of such primitives even when using low-cost, non-switched LAN technology. We implemented and studied some simple barrier synchronization protocols atop the Genoa Active Message MAchine (GAMMA), an efficient Active Messages-like communication layer running on a cluster of Pentium PCs connected by a 100base-TX Ethernet repeater hub. In the case of synchronized or quasi-synchronized processes issuing a barrier synchronization, an obvious way to avoid collisions on shared 100base-T Ethernet is to use a barrier protocol which explicitly serializes all the inter-process synchronization communications over the LAN. We propose alternative barrier protocols which avoid Ethernet collisions during the synchronization phase without requiring such a full explicit serialization. One of these protocols definitely outperforms the fully serialized barrier protocol over 100base-T Ethernet as well as the MPI implementations of barrier synchronization on IBM SP2 and Intel Paragon.
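
The idea of a fully serialized barrier can be illustrated with a small model. The sketch below is not the protocol from the paper and does not use the GAMMA API; it assumes a simple chain in which process i forwards an "arrived" token to process i + 1, and the last process then releases everyone with a single broadcast frame. The names `serialized_barrier_schedule`, `has_collision`, and `BROADCAST` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple

# Hypothetical marker meaning "every station on the shared hub".
BROADCAST = -1

# One frame on the wire: (time_slot, sender, receiver).
Frame = Tuple[int, int, int]


def serialized_barrier_schedule(n: int) -> List[Frame]:
    """Build the frame schedule of an assumed chain-style serialized barrier.

    Each frame gets its own time slot, so at most one station transmits
    at a time and the shared medium never sees a collision.
    """
    if n < 1:
        raise ValueError("need at least one process")
    schedule: List[Frame] = []
    slot = 0
    # Gather phase: process i sends its token only after hearing from i - 1,
    # so the sends are strictly ordered in time.
    for sender in range(n - 1):
        schedule.append((slot, sender, sender + 1))
        slot += 1
    # Release phase: the last process has (transitively) heard from all
    # others and wakes them with one broadcast frame.
    if n > 1:
        schedule.append((slot, n - 1, BROADCAST))
    return schedule


def has_collision(schedule: List[Frame]) -> bool:
    """Two frames collide in this model if they share a time slot."""
    slots = [slot for slot, _, _ in schedule]
    return len(slots) != len(set(slots))
```

In this model a barrier among n processes costs n frames sent strictly one after another, which shows why full serialization avoids collisions but makes the barrier time grow linearly with the number of processes.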

Similar resources

Area and Performance Optimization of Barrier Synchronization on Multi-core Network-on-Chips

Barrier synchronization is commonly and widely used to synchronize the execution of parallel processor cores on multi-core Network-on-Chips (NoCs). Since its global nature may cause heavy serialization resulting in large performance penalty, barrier synchronization should be carefully designed to have low latency communication and to minimize overall completion time. Therefore, in the paper, we...

Fast synchronization on shared-memory multiprocessors: An architectural approach

Synchronization is a crucial operation in many parallel applications. Conventional synchronization mechanisms are failing to keep up with the increasing demand for efficient synchronization operations as systems grow larger and network latency increases. The contributions of this paper are threefold. First, we revisit some representative synchronization algorithms in light of recent architectur...

Fast Barrier Synchronization in Wormhole k-ary n-cube Networks with Multidestination Worms

This paper presents a new approach to implement fast barrier synchronization in wormhole k-ary n-cubes. The novelty lies in using multidestination messages instead of the traditional single destination messages. Two different multidestination worm types, gather and broadcasting, are introduced to implement the report and wake-up phases of barrier synchronization, respectively. Algorithms for co...

A New Prediction Oriented Barrier Synchronization on SMP Clusters

Clusters of Symmetric Multiprocessors (CSMP) are becoming an increasingly popular high-performance computing platform due to the commodity availability of multiprocessor nodes, mature SMP operating systems, low-latency, high-bandwidth data networks, and superior price-performance ratio. Fast synchronization is crucial to making efficient use of SMP clusters. In this paper, we focus on one kind o...

A Fast Inter-Kernel Communication and Synchronization layer for MetalSVM

In this paper, we present the basic concepts for a fast inter-kernel communication and synchronization layer motivated by the realization of an SCC-related shared virtual memory management system, called MetalSVM. This scalable memory management system is implemented in terms of a bare-metal hypervisor, located within a virtualization layer between the SCC’s hardware and the actual operating system. I...

Publication date: 1998